Overview

Dataset statistics

Number of variables: 20
Number of observations: 8124
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4528
Duplicate rows (%): 55.7%
Total size in memory: 1.2 MiB
Average record size in memory: 160.0 B
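The derived figures in this table follow from simple arithmetic on the counts. The sketch below reproduces them; the 8-bytes-per-value figure is an assumption (it would correspond to 64-bit integer columns) chosen because it matches the reported 160.0 B record size, not something the report states.

```python
# Counts copied from the dataset statistics above.
n_rows = 8124
n_cols = 20
duplicate_rows = 4528

# Assumption: every value is stored in 8 bytes (e.g. a 64-bit integer).
BYTES_PER_VALUE = 8

duplicate_pct = 100 * duplicate_rows / n_rows      # share of duplicate rows
record_bytes = n_cols * BYTES_PER_VALUE            # bytes per observation
total_mib = n_rows * record_bytes / 2**20          # total size in MiB

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {record_bytes} B")
print(f"Total size: {total_mib:.1f} MiB")
```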

Variable types

NUM: 9
BOOL: 6
CAT: 5

Reproduction

Analysis started: 2020-08-25 01:41:17.359208
Analysis finished: 2020-08-25 01:41:32.272665
Duration: 14.91 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

veil-type has constant value "0" [Constant]
Dataset has 4528 (55.7%) duplicate rows [Duplicates]
cap-shape has 452 (5.6%) zeros [Zeros]
stalk-color-above-ring has 448 (5.5%) zeros [Zeros]
gill-color has 408 (5.0%) zeros [Zeros]
population has 384 (4.7%) zeros [Zeros]
odor has 400 (4.9%) zeros [Zeros]
ring-type has 2776 (34.2%) zeros [Zeros]
cap-color has 2284 (28.1%) zeros [Zeros]
habitat has 2148 (26.4%) zeros [Zeros]
stalk-root has 2480 (30.5%) zeros [Zeros]

Variables

cap-shape
Real number (ℝ≥0)

ZEROS

Distinct count: 6
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4918759231905465
Minimum: 0
Maximum: 5
Zeros: 452
Zeros (%): 5.6%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 2
Q3: 3
95-th percentile: 4
Maximum: 5
Range: 5
Interquartile range (IQR): 1
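These quantiles can be recomputed from the cap-shape value counts listed in this report. The following is a minimal sketch, assuming linear interpolation between order statistics; the helper name `quantile` is illustrative and not part of pandas-profiling.

```python
from math import ceil, floor

# Value counts for cap-shape, copied from the frequency table in this report.
counts = {0: 452, 1: 4, 2: 3656, 3: 3152, 4: 828, 5: 32}

# Expand the counts into a sorted list of observations.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]


def quantile(sorted_values, q):
    """Quantile q in [0, 1] with linear interpolation (hypothetical helper)."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = floor(pos), ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


q05, q1, med, q3, q95 = (quantile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1                           # interquartile range
value_range = values[-1] - values[0]    # maximum minus minimum

print(q05, q1, med, q3, q95, iqr, value_range)
```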

Descriptive statistics

Standard deviation: 0.9012871786
Coefficient of variation (CV): 0.3616902311
Kurtosis: 1.371745853
Mean: 2.491875923
Median Absolute Deviation (MAD): 1
Skewness: -0.6195585645
Sum: 20244
Variance: 0.8123185782
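Several of these descriptive statistics can be checked against the cap-shape value counts. The sketch below is an illustration, not the profiler's own code; it assumes the variance is the sample variance (divisor n - 1), which is the choice that reproduces the reported figure.

```python
from math import sqrt
from statistics import median

# Value counts for cap-shape, copied from the frequency table in this report.
counts = {0: 452, 1: 4, 2: 3656, 3: 3152, 4: 828, 5: 32}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

# Assumption: sample variance with divisor n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean  # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]
med = median(values)
mad = median([abs(v - med) for v in values])

print(total, mean, variance, std, cv, mad)
```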
Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
2      3656   45.0%
3      3152   38.8%
4      828    10.2%
0      452    5.6%
5      32     0.4%
1      4      < 0.1%

Value  Count  Frequency (%)
0      452    5.6%
1      4      < 0.1%
2      3656   45.0%
3      3152   38.8%
4      828    10.2%
5      32     0.4%

Value  Count  Frequency (%)
5      32     0.4%
4      828    10.2%
3      3152   38.8%
2      3656   45.0%
1      4      < 0.1%
0      452    5.6%

stalk-color-above-ring
Real number (ℝ≥0)

ZEROS

Distinct count: 9
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4465780403742
Minimum: 0
Maximum: 8
Zeros: 448
Zeros (%): 5.5%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
median: 7
Q3: 7
95-th percentile: 7
Maximum: 8
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.143900327
Coefficient of variation (CV): 0.3936233561
Kurtosis: 0.5917901093
Mean: 5.44657804
Median Absolute Deviation (MAD): 0
Skewness: -1.301345538
Sum: 44248
Variance: 4.596308614

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
7      4464   54.9%
5      1872   23.0%
3      576    7.1%
0      448    5.5%
1      432    5.3%
4      192    2.4%
6      96     1.2%
2      36     0.4%
8      8      0.1%

Value  Count  Frequency (%)
0      448    5.5%
1      432    5.3%
2      36     0.4%
3      576    7.1%
4      192    2.4%
5      1872   23.0%
6      96     1.2%
7      4464   54.9%
8      8      0.1%

Value  Count  Frequency (%)
8      8      0.1%
7      4464   54.9%
6      96     1.2%
5      1872   23.0%
4      192    2.4%
3      576    7.1%
2      36     0.4%
1      432    5.3%
0      448    5.5%
 

gill-color
Real number (ℝ≥0)

ZEROS

Distinct count: 12
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.729443623830625
Minimum: 0
Maximum: 11
Zeros: 408
Zeros (%): 5.0%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 4
Q3: 7
95-th percentile: 10
Maximum: 11
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.342401907
Coefficient of variation (CV): 0.7067220107
Kurtosis: -1.32490759
Mean: 4.729443624
Median Absolute Deviation (MAD): 3
Skewness: 0.3387940788
Sum: 38422
Variance: 11.17165051

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
2      1728   21.3%
7      1492   18.4%
10     1202   14.8%
1      1048   12.9%
4      752    9.3%
3      732    9.0%
8      492    6.1%
0      408    5.0%
9      96     1.2%
11     86     1.1%
6      64     0.8%
5      24     0.3%

Value  Count  Frequency (%)
0      408    5.0%
1      1048   12.9%
2      1728   21.3%
3      732    9.0%
4      752    9.3%
5      24     0.3%
6      64     0.8%
7      1492   18.4%
8      492    6.1%
9      96     1.2%

Value  Count  Frequency (%)
11     86     1.1%
10     1202   14.8%
9      96     1.2%
8      492    6.1%
7      1492   18.4%
6      64     0.8%
5      24     0.3%
4      752    9.3%
3      732    9.0%
2      1728   21.3%

cap-surface
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
2      3244   39.9%
3      2556   31.5%
0      2320   28.6%
1      4      < 0.1%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
2      3244   39.9%
3      2556   31.5%
0      2320   28.6%
1      4      < 0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8124   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
2      3244   39.9%
3      2556   31.5%
0      2320   28.6%
1      4      < 0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  8124   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
2      3244   39.9%
3      2556   31.5%
0      2320   28.6%
1      4      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8124   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
2      3244   39.9%
3      2556   31.5%
0      2320   28.6%
1      4      < 0.1%

veil-type
Boolean

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
0      8124   100.0%
 
gill-attachment
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
1      7914   97.4%
0      210    2.6%

population
Real number (ℝ≥0)

ZEROS

Distinct count: 6
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.6440177252584935
Minimum: 0
Maximum: 5
Zeros: 384
Zeros (%): 4.7%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 4
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.25208182
Coefficient of variation (CV): 0.3435992671
Kurtosis: 1.676557918
Mean: 3.644017725
Median Absolute Deviation (MAD): 1
Skewness: -1.413095676
Sum: 29604
Variance: 1.567708884

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
4      4040   49.7%
5      1712   21.1%
3      1248   15.4%
2      400    4.9%
0      384    4.7%
1      340    4.2%

Value  Count  Frequency (%)
0      384    4.7%
1      340    4.2%
2      400    4.9%
3      1248   15.4%
4      4040   49.7%
5      1712   21.1%

Value  Count  Frequency (%)
5      1712   21.1%
4      4040   49.7%
3      1248   15.4%
2      400    4.9%
1      340    4.2%
0      384    4.7%
 
stalk-surface-above-ring
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
3      5176   63.7%
2      2372   29.2%
0      552    6.8%
1      24     0.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
3      5176   63.7%
2      2372   29.2%
0      552    6.8%
1      24     0.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8124   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
3      5176   63.7%
2      2372   29.2%
0      552    6.8%
1      24     0.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  8124   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
3      5176   63.7%
2      2372   29.2%
0      552    6.8%
1      24     0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8124   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
3      5176   63.7%
2      2372   29.2%
0      552    6.8%
1      24     0.3%

bruises?
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
1      4748   58.4%
0      3376   41.6%

odor
Real number (ℝ≥0)

ZEROS

Distinct count: 9
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.788281634662728
Minimum: 0
Maximum: 8
Zeros: 400
Zeros (%): 4.9%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
median: 6
Q3: 6
95-th percentile: 8
Maximum: 8
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.983678459
Coefficient of variation (CV): 0.4142777328
Kurtosis: 0.07008745619
Mean: 4.788281635
Median Absolute Deviation (MAD): 2
Skewness: -0.726370513
Sum: 38900
Variance: 3.93498023

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
6      3528   43.4%
4      2160   26.6%
3      576    7.1%
8      576    7.1%
1      400    4.9%
0      400    4.9%
7      256    3.2%
2      192    2.4%
5      36     0.4%

Value  Count  Frequency (%)
0      400    4.9%
1      400    4.9%
2      192    2.4%
3      576    7.1%
4      2160   26.6%
5      36     0.4%
6      3528   43.4%
7      256    3.2%
8      576    7.1%

Value  Count  Frequency (%)
8      576    7.1%
7      256    3.2%
6      3528   43.4%
5      36     0.4%
4      2160   26.6%
3      576    7.1%
2      192    2.4%
1      400    4.9%
0      400    4.9%

ring-number
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
1      7488   92.2%
2      600    7.4%
0      36     0.4%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
1      7488   92.2%
2      600    7.4%
0      36     0.4%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8124   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
1      7488   92.2%
2      600    7.4%
0      36     0.4%

Most occurring scripts

Value   Count  Frequency (%)
Common  8124   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
1      7488   92.2%
2      600    7.4%
0      36     0.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8124   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
1      7488   92.2%
2      600    7.4%
0      36     0.4%
 
stalk-surface-below-ring
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
3      4936   60.8%
2      2304   28.4%
0      600    7.4%
1      284    3.5%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
3      4936   60.8%
2      2304   28.4%
0      600    7.4%
1      284    3.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8124   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
3      4936   60.8%
2      2304   28.4%
0      600    7.4%
1      284    3.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  8124   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
3      4936   60.8%
2      2304   28.4%
0      600    7.4%
1      284    3.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8124   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
3      4936   60.8%
2      2304   28.4%
0      600    7.4%
1      284    3.5%

ring-type
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.291974396848843
Minimum: 0
Maximum: 4
Zeros: 2776
Zeros (%): 34.2%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 4
95-th percentile: 4
Maximum: 4
Range: 4
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.801672002
Coefficient of variation (CV): 0.7860785899
Kurtosis: -1.708767354
Mean: 2.291974397
Median Absolute Deviation (MAD): 2
Skewness: -0.2900182044
Sum: 18620
Variance: 3.246022003

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
4      3968   48.8%
0      2776   34.2%
2      1296   16.0%
1      48     0.6%
3      36     0.4%

Value  Count  Frequency (%)
0      2776   34.2%
1      48     0.6%
2      1296   16.0%
3      36     0.4%
4      3968   48.8%

Value  Count  Frequency (%)
4      3968   48.8%
3      36     0.4%
2      1296   16.0%
1      48     0.6%
0      2776   34.2%

veil-color
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
2      7924   97.5%
1      96     1.2%
0      96     1.2%
3      8      0.1%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count  Frequency (%)
2      7924   97.5%
0      96     1.2%
1      96     1.2%
3      8      0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  8124   100.0%

Most frequent Decimal Number characters

Value  Count  Frequency (%)
2      7924   97.5%
0      96     1.2%
1      96     1.2%
3      8      0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  8124   100.0%

Most frequent Common characters

Value  Count  Frequency (%)
2      7924   97.5%
0      96     1.2%
1      96     1.2%
3      8      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8124   100.0%

Most frequent ASCII characters

Value  Count  Frequency (%)
2      7924   97.5%
0      96     1.2%
1      96     1.2%
3      8      0.1%

cap-color
Real number (ℝ≥0)

ZEROS

Distinct count: 10
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.323485967503693
Minimum: 0
Maximum: 9
Zeros: 2284
Zeros (%): 28.1%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 3
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 9
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 3.444390875
Coefficient of variation (CV): 0.7966698401
Kurtosis: -1.595954042
Mean: 4.323485968
Median Absolute Deviation (MAD): 3
Skewness: -0.01618156112
Sum: 35124
Variance: 11.8638285

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
0      2284   28.1%
3      1840   22.6%
7      1500   18.5%
9      1072   13.2%
8      1040   12.8%
1      168    2.1%
5      144    1.8%
2      44     0.5%
6      16     0.2%
4      16     0.2%

Value  Count  Frequency (%)
0      2284   28.1%
1      168    2.1%
2      44     0.5%
3      1840   22.6%
4      16     0.2%
5      144    1.8%
6      16     0.2%
7      1500   18.5%
8      1040   12.8%
9      1072   13.2%

Value  Count  Frequency (%)
9      1072   13.2%
8      1040   12.8%
7      1500   18.5%
6      16     0.2%
5      144    1.8%
4      16     0.2%
3      1840   22.6%
2      44     0.5%
1      168    2.1%
0      2284   28.1%
 
stalk-shape
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
1      4608   56.7%
0      3516   43.3%

habitat
Real number (ℝ≥0)

ZEROS

Distinct count: 7
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2210733628754307
Minimum: 0
Maximum: 6
Zeros: 2148
Zeros (%): 26.4%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 3
Q3: 6
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 2.530691874
Coefficient of variation (CV): 0.785667257
Kurtosis: -1.673747886
Mean: 3.221073363
Median Absolute Deviation (MAD): 3
Skewness: -0.09599061819
Sum: 26168
Variance: 6.404401359

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
6      3148   38.7%
0      2148   26.4%
3      1144   14.1%
1      832    10.2%
4      368    4.5%
2      292    3.6%
5      192    2.4%

Value  Count  Frequency (%)
0      2148   26.4%
1      832    10.2%
2      292    3.6%
3      1144   14.1%
4      368    4.5%
5      192    2.4%
6      3148   38.7%

Value  Count  Frequency (%)
6      3148   38.7%
5      192    2.4%
4      368    4.5%
3      1144   14.1%
2      292    3.6%
1      832    10.2%
0      2148   26.4%

gill-size
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
0      5612   69.1%
1      2512   30.9%

stalk-root
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1097981290004924
Minimum: 0
Maximum: 4
Zeros: 2480
Zeros (%): 30.5%
Memory size: 63.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 3
Maximum: 4
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.061106068
Coefficient of variation (CV): 0.9561252991
Kurtosis: 0.08976099331
Mean: 1.109798129
Median Absolute Deviation (MAD): 1
Skewness: 0.9478523612
Sum: 9016
Variance: 1.125946088

Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
1      3776   46.5%
0      2480   30.5%
3      1120   13.8%
2      556    6.8%
4      192    2.4%

Value  Count  Frequency (%)
0      2480   30.5%
1      3776   46.5%
2      556    6.8%
3      1120   13.8%
4      192    2.4%

Value  Count  Frequency (%)
4      192    2.4%
3      1120   13.8%
2      556    6.8%
1      3776   46.5%
0      2480   30.5%

target
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 63.6 KiB

Value  Count  Frequency (%)
0      4208   51.8%
1      3916   48.2%

Interactions

[81 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
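As an illustration of this definition, a minimal pure-Python sketch follows. The function name `pearson_r` is hypothetical, and returning NaN for a constant input is a choice made here (a constant column such as veil-type has zero standard deviation, so r is undefined).

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/(n - 1) factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return float("nan")  # undefined for a constant variable
    return cov / (dev_x * dev_y)


x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
r = pearson_r(x, y)
# Invariance under a change of location and scale of one variable.
r_shifted = pearson_r(x, [3 * v + 7 for v in y])
print(r, r_shifted)
```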

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
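A minimal sketch of this calculation: rank both variables, then apply the Pearson formula to the ranks. Assigning tied values the average of their ranks is an assumption made here (it is the usual convention), and the function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]        # monotonic but nonlinear
rho = spearman_rho(x, y)       # perfect monotonic association
r = pearson_r(x, y)            # strictly below 1 for this nonlinear relation
print(rho, r)
```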

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
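The pair-counting definition can be sketched directly. This is the simple form often called tau-a, in which tied pairs count as neither concordant nor discordant; statistical libraries commonly default to a tie-adjusted variant (tau-b), so values may differ on data with many ties, such as the integer-coded columns in this report. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```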

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
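A minimal sketch of a bias-corrected Cramér's V follows, using the commonly cited form of Bergsma's correction. Whether the report computes exactly this expression is an assumption; the function name and the choice to return 0.0 when one variable is constant are illustrative.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two nominal variables (illustrative sketch)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    joint = Counter(zip(x, y))
    row_totals, col_totals = Counter(x), Counter(y)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Corrections to phi squared and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a constant variable carries no association
    return sqrt(phi2_corr / denom)


perfect = cramers_v_corrected([0] * 50 + [1] * 50, [0] * 50 + [1] * 50)
independent = cramers_v_corrected([0, 0, 1, 1] * 25, [0, 1, 0, 1] * 25)
print(perfect, independent)
```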

Missing values

[2 missing-values plots]

Sample

First rows

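index, cap-shape, stalk-color-above-ring, gill-color, cap-surface, veil-type, gill-attachment, population, stalk-surface-above-ring, bruises?, odor, ring-number, stalk-surface-below-ring, ring-type, veil-color, cap-color, stalk-shape, habitat, gill-size, stalk-root, target
0, 2, 3, 8, 0, 0, 1, 4, 3, 0, 6, 1, 3, 4, 2, 7, 1, 6, 0, 1, 0
1, 0, 7, 4, 3, 0, 1, 2, 3, 0, 0, 1, 3, 4, 2, 9, 0, 0, 0, 2, 0
2, 2, 7, 0, 3, 0, 1, 3, 0, 1, 6, 1, 3, 0, 2, 8, 1, 0, 0, 3, 0
3, 2, 0, 4, 2, 0, 1, 4, 2, 1, 4, 1, 2, 2, 2, 3, 0, 3, 0, 1, 1
4, 3, 7, 8, 0, 0, 1, 5, 3, 0, 6, 1, 3, 4, 2, 0, 1, 6, 0, 1, 0
5, 2, 7, 0, 2, 0, 1, 4, 3, 0, 7, 1, 3, 4, 2, 8, 0, 4, 1, 3, 1
6, 4, 7, 2, 3, 0, 1, 4, 3, 1, 8, 1, 3, 0, 2, 0, 1, 1, 1, 0, 1
7, 3, 6, 10, 2, 0, 1, 1, 3, 0, 6, 2, 3, 0, 2, 5, 0, 5, 0, 0, 0
8, 2, 0, 4, 0, 0, 1, 5, 2, 1, 4, 1, 2, 2, 2, 9, 0, 3, 0, 1, 1
9, 3, 5, 4, 0, 0, 1, 4, 2, 1, 4, 1, 2, 2, 2, 3, 0, 0, 0, 1, 1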

Last rows

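index, cap-shape, stalk-color-above-ring, gill-color, cap-surface, veil-type, gill-attachment, population, stalk-surface-above-ring, bruises?, odor, ring-number, stalk-surface-below-ring, ring-type, veil-color, cap-color, stalk-shape, habitat, gill-size, stalk-root, target
8114, 2, 7, 1, 0, 0, 1, 0, 0, 1, 6, 1, 0, 0, 2, 3, 1, 0, 0, 3, 0
8115, 3, 5, 7, 2, 0, 1, 4, 2, 1, 4, 1, 2, 2, 2, 9, 0, 0, 0, 1, 1
8116, 2, 7, 1, 2, 0, 1, 5, 3, 0, 1, 1, 1, 4, 2, 0, 0, 3, 0, 4, 0
8117, 2, 7, 10, 3, 0, 1, 4, 3, 0, 7, 1, 3, 4, 2, 0, 0, 0, 1, 3, 1
8118, 4, 7, 2, 3, 0, 1, 4, 2, 1, 8, 1, 3, 0, 2, 0, 1, 6, 1, 0, 1
8119, 3, 5, 2, 2, 0, 1, 4, 2, 1, 8, 1, 2, 0, 2, 0, 1, 1, 1, 0, 1
8120, 3, 7, 0, 2, 0, 1, 3, 3, 0, 7, 1, 3, 4, 2, 8, 0, 4, 1, 3, 1
8121, 2, 7, 4, 0, 0, 1, 2, 2, 1, 6, 2, 3, 4, 2, 8, 0, 0, 0, 0, 0
8122, 2, 7, 4, 3, 0, 1, 3, 3, 1, 6, 2, 2, 4, 2, 8, 0, 0, 0, 0, 0
8123, 3, 7, 2, 2, 0, 1, 4, 2, 1, 3, 1, 3, 0, 2, 0, 1, 6, 1, 0, 1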

Duplicate rows

Most frequent

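index, cap-shape, stalk-color-above-ring, gill-color, cap-surface, veil-type, gill-attachment, population, stalk-surface-above-ring, bruises?, odor, ring-number, stalk-surface-below-ring, ring-type, veil-color, cap-color, stalk-shape, habitat, gill-size, stalk-root, target, count
288, 2, 3, 1, 0, 0, 1, 4, 3, 0, 6, 1, 3, 4, 2, 0, 1, 6, 0, 1, 0, 6
289, 2, 3, 1, 0, 0, 1, 4, 3, 0, 6, 1, 3, 4, 2, 3, 1, 6, 0, 1, 0, 6
290, 2, 3, 1, 0, 0, 1, 4, 3, 0, 6, 1, 3, 4, 2, 7, 1, 6, 0, 1, 0, 6
291, 2, 3, 1, 0, 0, 1, 5, 3, 0, 6, 1, 3, 4, 2, 0, 1, 6, 0, 1, 0, 6
292, 2, 3, 1, 0, 0, 1, 5, 3, 0, 6, 1, 3, 4, 2, 3, 1, 6, 0, 1, 0, 6
293, 2, 3, 1, 0, 0, 1, 5, 3, 0, 6, 1, 3, 4, 2, 7, 1, 6, 0, 1, 0, 6
294, 2, 3, 1, 2, 0, 1, 4, 3, 0, 6, 1, 3, 4, 2, 0, 1, 6, 0, 1, 0, 6
295, 2, 3, 1, 2, 0, 1, 4, 3, 0, 6, 1, 3, 4, 2, 3, 1, 6, 0, 1, 0, 6
296, 2, 3, 1, 2, 0, 1, 4, 3, 0, 6, 1, 3, 4, 2, 7, 1, 6, 0, 1, 0, 6
297, 2, 3, 1, 2, 0, 1, 5, 3, 0, 6, 1, 3, 4, 2, 0, 1, 6, 0, 1, 0, 6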